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1  introduction 


The  Programmable  Gate  Array  (PGA)  is  an  exciting  new 
idea  in  semi-custom  integrated  circuits  that  reduces  the  IC 
manufacturing  time  from  months  to  minutes  and  prototype  cost 
from  tens  of  kilodollars  to  under  $100.'  The  PGA  was 
introduced  in  [Cart86]  and  newer  vqrSoftsrfiave  been  presented 
in  [Hsie87Jfsie88,ElGa88aJElAy88]^t  is  similar  to  a  gate  array 
in  structure,  but  can  be  field-programmed  to  specify  the 
function  of  the  basic  logic  blocks  and  their  interconnection. 
This  paper  studies  the  effect  of  logic  block  complexity  on  total 

circuit  area  for  PGAs.  /j  -  - 

_ _  _ y  r  /  '  if 

The  architecture  of  a  PGA  consists  of  its  logic  block 
function,  interconnection  scheme,  XriZf  block  design  and  the 
global  structure.  There  are  many  tradeoffs  between  architecture, 
area,  and  speed,  each  of  which  depends  heavily  on  the 
programming  technology.  Programming  technology  is  the 
underlying  method  by  which  the  logic  function  is  set  and  the 
connections  are  implemented  at  program  time.  For  example,  the 
programming  technology  used  in  [Hsie8$pis  based  on  static 
RAM  and transistors,  while  that  of  [ElGa88a]  uses  an  anti- 
fuse.  In  this  paper  vwe  focua^on  the  effect  of  logic  block 
complexity  on  PGA  area,  ignoring  speed  considerations.  While 
circuit  speed  is  very  important,  this  wotk  represents  an  initial 
exploration  into  plausible  architectures  from  an  area  perspective. 

We  address  two  questions:  First,  should  the  basic  logic 
block  contain  a  significant  amount  of  fixed  hardware,  such  as  a 
D  flip-flop?  Our  experimental  results  indicate  that  a  D  flip-flop  is 
desirable  for  large  programming  technologies  (like  SRAM 
[Hsie88])  but  that  it  is  inefficient  for  smaller  technologies  such 
as  the  anti-fuse.  Second,  for  logic  blocks  containing  arbitrary 
combinational  logic  functions,  (Le.  any  K  to  1  logic  function) 
what  is  the  best  number  (K)  of  inputs  to  use?  Surprisingly,  the 
best  number  of  inputs  remains  nearly  constant  over  a  wide  range 
of  programming  technologies  and  was  almost  the  same  whether 
or  not  the  block  contained  a  D  flip-flop. 


To  answer  these  questions,  our  approach  is  to  implement  a 
set  of  circuits  in  a  variety  of  logic  blocks  and  programming 
technologies,  and  determine  the  area  required  for  each.  This 
data  will  indicate  an  appropriate  choice  of  logic  block,  in  terms 
of  area,  for  a  given  technology. 


Figure  1  -  General  Model  of  Logic  Block 


Figure  1  depicts  the  general  architectural  model  used  for  the 
logic  block.  It  consists  of  a  K-input  arbitrary  combinational 
logic  function  (referred  to  as  an  "Arb-K"),  connected  to  a  D 
flip-flop  followed  by  a  multiplexer  that  selects  either  the  flip-flop 
output  or  the  Arb-K  output.  Its  output  is  passed  to  a  tristate 
driver  that  can  be  enabled  by  another  input  or  left  permanently 
on.  To  determine  if  the  D  flip-flop  is  beneficial,  two  variations 
of  this  basic  model  will  be  considered:  one  that  contains  the  D 
flip-flop,  and  one  that  does  not 

The  global  architecture  of  the  PGA  under  consideration  is 
shown  in  Figure  2.  It  is  a  regular  array  of  logic  blocks, 
separated  by  horizontal  and  vertical  routing  channels.  The 
number  of  tracks  in  all  of  the  routing  channels,  W,  is  the  same. 
Since  we  want  to  know  the  area  requirements  of  a  logic  block 
architecture,  a  crucial  concept  in  this  procedure  is  that  W  is 
determined  by  the  placement  and  routing  for  each  circuit 

The  following  procedure  performs  the  circuit 
implementation: 

Input:  a  logic  circuit,  a  range  of  K’s  indicating  how  many  inputs 
on  the  Arb-K  block,  and  a  set  of  programming  technologies. 


This  work  was  supported  by  DARPA  Contract  #N00014-87-K-0828,  and 
NSERC  Operating  Grants  #A4029  and  #A4033. 


oferoMmON  stateRSTTa 

-  1  - 

~  o  o 

A 

n  v  n 

Approve-!  f  -  public  ralaoM; 

V  '  o 

<  w 

u  i  tl 

DUdrT.  v'io  >.  i  Virr.'tod 

Jonathan  Rose,  Center  for  Integrated  Systems,  Stafford,  CA  94305  (415)— 725-3635 


W  tret*  pmt  C»»nn»> 


Figure  2  -  Routing  Model  of  PGA 

Output:  for  each  (K,  programming  technology)  pair  the  area 
required  to  implement  the  circuit  with  a  logic  block  that  contains 
a  D  flip-flop,  and  the  area  for  a  logic  block  that  does  not. 

Procedure:  For  each  logic  block  type: 

1.  Partition  the  original  circuit  into  the  current  logic  block. 
This  is  sometimes  called  technology  mapping  [Detj87],  but 
is  a  more  difficult  problem  for  PGAs  because  each  logic 
block  can  collapse  many  combinational  logic  functions.  The 
Chortle  program  was  developed  to  do  this  mapping 
[Fran88].  It  uses  a  greedy  algorithm  that  tries  to  collapse  as 
many  standard  cells  as  it  can  into  each  logic  block. 

2.  Perform  the  placement  of  the  resulting  circuit.  This  is  done 
using  the  Altor  placement  program  [Rose85],  which  is  based 
on  the  min-cut  placement  algorithm  [Breu77],  Altor  makes 
the  array  as  square  as  possible. 

3.  Perform  the  global  routing  of  the  circuit.  Global  routing 
determines  the  path  of  channels  that  each  wire  is  to  take,  and 
hence  determines  the  maximum  number  of  tracks  required  in 
each  channel,  W .  The  algorithm  used  is  similar  to  the  one 
described  in  [Rose88],  but  is  changed  to  fit  the  routing 
model  pictured  in  Figure  2. 

4.  Section  3  describes  a  model  for  the  logic  block  area  and 
routing  area  as  a  function  of  K  and  programming 
technology.  With  this  model,  W,  and  the  placement 
dimensions,  the  circuit  area  for  a  range  of  programming 
technologies  is  calculated. 

The  above  procedure  makes  the  approximation  that  the 
global  routing  track  count  determines  the  number  of  tracks 
required  in  a  channel.  This  is  generally  accepted  as  true  for 
unconstrained  channel  routers,  but  may  not  be  true  for  switch- 
based  routing  schemes.  We  have  reason  to  believe,  however, 
that  the  error  in  this  assumption  is  only  a  few  tracks  [ElGa88b], 


3  Architecture  Model 

The  area  of  a  logic  block  is  a  function  of  the  number  of  its 
inputs,  the  amount  of  fixed  hardware  it  contains,  and  the 
programming  technology.  The  pitch  of  the  routing  track  can  be 
approximately  modeled  as  a  function  of  the  programming 
technology. 

The  programming  technology  is  represented  by  one 
parameter:  the  area  required  to  store  one  bit  in  the  technology, 
or  Bit  Area  (BA).  For  example,  in  the  Xilinx  PGA  [Hsie88],  the 
Bit  Area  is  the  area  of  a  static  RAM  bit.  In  the  Actel  PGA 
[ElGa88a]  the  bit  area  is  much  smaller,  close  to  the  space 
required  by  an  anti-fuse.  The  overhead  required  to  access  the 
Arb-K  block  and  the  area  required  by  the  D  flip-flop  (if  it  is 
present)  and  all  other  n on-arbitrary  logic  function  hardware  is 
represented  by  a  second  parameter,  called  the  Fixed  Overhead 
Area  (FA). 

An  Arb-K  block,  because  it  can  implement  any  K  to  1  logic 
function,  requires  2*  bits  of  information  to  be  stored  and  so 
must  have  area  proportional  to  2*.  Using  this,  we  can  derive 
the  following  expression  for  logic  block  area: 

Logic  Block  Area  =  BA  x  2K  +  FA 

where  BA  is  the  bit  area  in  the  programming  technology  and  FA 
is  the  fixed  overhead. 

The  basic  technology  is  assumed  to  be  1.2S|im  CMOS.  FA 
has  been  calibrated  using  data  acquired  from  Xilinx  [Cart88], 
giving  FA  =  1200|im  2  for  logic  blocks  without  a  D  flip-flop  and 
16000|Xnz2  for  logic  blocks  with  a  D  flip-flop.  The 
corresponding  Bit  Area  for  an  SRAM  programming  technology 
is  400(J/n  2  and  is  roughly  40(l/n  2  for  an  anti-fuse  technology. 
In  our  experiments,  we  will  vary  the  Bit  Area  between  and 
above  these  two  values. 

Though  the  PGA  interconnection  scheme  is  not  addressed 
directly  in  this  paper,  the  area  required  by  routing  is  an 
important  factor  in  determining  the  logic  block.  We  need  to 
know  the  pitch  of  die  routing  track  as  a  function  of 
programming  technology.  Each  routing  track  will  need  at  least 
one  bit  of  information  in  it,  and  probably  several  —  to  determine 
if  a  set  of  switches  or  fuses  is  open  or  dosed.  Since  it  is  difficult 
to  lay  out  a  bit  with  highly  non-square  aspect  ratios,  the  pitch  of 
a  routing  track  is  approximated  as  the  square  root  of  the  area 
required  by  a  bit,  i.e.  Routing  Pitch  =  ''BA  . 

4  Experimental  Results 

The  circuits  used  in  these  experiments  are  five  standard-cell 
circuits  obtained  from  Bell -Northern  Research  [Mart88], 
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ranging  in  size  from  420  to  168 1  standard  cells.  They  consist  of 
a  mix  of  random  logic  and  data  path  circuits.  Figure  3  is  a  plot 
of  absolute  area  for  the  PGA  versus  number  of  inputs  to  the 
arbitrary  combinational  logic  block.  K,  for  a  1073  standard  cell 
circuit. 


K 

Figure  3  -  Area  for  versus  K  for  One  Circuit 

There  are  two  curves  -  one  giving  area  when  the  logic  block 
contains  a  D  flip-flop,  and  one  without.  The  programming 
technology,  BA  =  415\im 2,  corresponds  to  an  SRAM-based 
approach  [Hsie88].  Using  similar  data  for  all  of  the  circuits,  with 
more  programming  technologies,  the  questions  raised  in  the 
introduction  were  addressed. 

4.1  Number  of  Inputs  to  Logic  Block 

Figure  4  shows  the  sum  of  the  normalized  areas  over  all  of 
the  circuits,  versus  K.  The  normalized  area  for  a  circuit  is 
determined  by  dividing  the  area  using  logic  block  K  by  the  best 
area  over  all  K.  The  logic  block  used  in  this  data  contains  a  D 
flip-flop.  Figure  4  gives  several  plots  for  different  bit  areas 
(programming  technologies). 


K 

Figure  4  -  Sum  of  Normalized  Areas  versus  K  Using  DFF 


function  of  the  bit  area. 

The  number  of  logic  blocks  increases  when  the  logic  block 
has  no  flip-flop  because  the  D  flip-flops  must  then  be 
implemented  in  combinational  gates.  Since  the  size  of  each  logic 
block  is  less,  the  final  area  may  or  may  not  be  smaller.  Figure  S 
is  a  normalized  area  plot  for  logic  blocks  that  do  not  contain  D 
flip-flops.  The  best  number  of  inputs  in  this  case  is  three,  only 
slightly  different  than  the  D  flip-flop  case.  Again,  this  number  is 
independent  of  programming  technology. 


K 

Figure  5  -  Sum  of  Normalized  Areas  versus  K  Without  DFF 

In  both  cases,  the  best  K  is  low  (3  or  4)  primarily  because 
the  circuits  cannot  make  effective  use  of  the  larger  K  blocks, 
because  the  increasing  functionality  comes  at  the  cost  of  a  much 
greater  active  area,  which  exponentially  increases  in  K,  it 
doesn't  pay  to  use  the  larger  logic  blocks. 

4.2  Utility  of  the  D  Flip-Flop 

Figure  6  is  plot  of  circuit  area  using  and  not  using  a  D  flip- 
flop  versus  Bit  Area  for  a  1073  standard  cell  circuit.  The  circuit 
area  used  is  the  one  obtained  with  the  lowest  area  K. 


It  is  clear,  from  the  dip  at  K  =  4,  that  a  4-input  arbitrary  R9Urd  6  * Arta  versus  Bit  1  Circuit 

logic  block  consistently  achieves  the  lowest  area.  Surprisingly, 

this  number  is  constant  over  a  wide  range  of  bit  areas.  It  is  due  This  fi*ure  shows  **.  for  very  small  bit  areas,  it  is 

to  the  fact  that,  for  a  given  K,  the  area  is  predominantly  a  linear  ^wtageous  not  to  use  a  D  flip  flop,  but  larger  bit  areas 
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perform  better  by  using  a.  D  flip  flop.  This  is  true  for  all  of  the 
circuits,  but  the  cross-over  point  is  different  for  each. 

t~ _ i  ■  ,  ,  -  Area  Without  Flip-Flop 

Figure  7  as  a  plot  of  "Ar,a  Wuh  Fllp!flo/  versus  Bit 

Area  for  each  circuit,  indicating  when  it  is  advantageous  to  use  a 
flip-flop.  In  the  smaller  bit  areas,  corresponding  to  an  anti-fuse 
programming  technology,  the  use  of  a  D  flip  flop  is  unprofitable. 
This  is  the  case  in  the  Actel  PGA  [ElGa88].  The  middle  and 
larger  bit  areas,  corresponding  to  the  SRAM  program 
technology,  can  benefit  by  including  a  D  flip-flop,  and  in  fact  the 
Xilinx  PGA  [Hsie88]  uses  two  D  flip-flops. 


Ratio  of  Area 
Without  DFF  to  With  DFF 


100  415  800  1600 

Bit  Area 

Figure  7  -  Without  DFF:With  DFF  versus  Bit  Area 


5  Conclusions  and  Future  Work 

We  have  presented  a  procedure  and  model  to  evaluate 
different  logic  block  architectures  for  Programmable  Gate 
Arrays,  on  the  basis  of  circuit  area.  Using  this  method,  for  a 
particular  set  of  circuits,  we  have  demonstrated  a  good  number 
of  inputs  to  use  for  the  arbitrary  combinational  logic  block.  In 
addition,  we  have  displayed  the  trade-off  between  programming 
technology  area  and  utility  of  a  D  flip-flop  in  the  logic  block. 

There  is  an  enormous  amount  of  future  work  to  be  done  in 
this  field.  We  would  like  to  investigate  more  kinds  of  logic 
blocks  -  in  particular  those  with  less  arbitrary  logic  functions. 
Other  work  will  directly  address  questions  dealing  with  circuit 
speed.  This  relates  to  the  specific  architecture  of  the 
interconnection  scheme.  All  of  this  wotk  needs  to  be 
implemented  on  a  wider  range  of  circuits.  New  CAD  algorithms 
need  to  be  developed  for  PGAs.  Our  technology  mapper  needs 
more  development,  and  the  placement  and  routing  needs  to 
address  the  specific  needs  of  PGAs.  PGAs,  because  they 
promise  such  enormous  economic  advantages,  are  a  fertile  and 
growing  field  of  research  and  development 
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